Abstract

This report describes the progress on the AFOSR research contract on the design
and training of limited-interconnect neural architectures. A novel local training rule has been
derived, an information-theoretic analysis to determine the capabilities and limitations of
multi-layered limited-interconnect neural networks has been developed, and the hardware
implementation of local training rules has been analyzed.

I.  Introduction 

In recent years, there has been a resurgence of interest in neural networks and models.
The potential for massive parallelism in such networks has generated this interest. Neural
networks are best suited for applications such as pattern recognition and signal processing.
Neuromorphic (brain-like) models offer an alternative for achieving real-time operation for
such tasks, while having a compact and robust architecture.

Neuromorphic models consist of interconnections of simple computational nodes. In this
approach, each node computes a weighted spatio-temporal integration of its inputs, and makes a
decision based on this integration. Specification of a neural network model consists of two
parts. The first part is the architecture specification. This includes determining the number of
nodes and the interconnection pattern of these nodes. The second specification task is providing a
training rule. The purpose of the training rule is to allow a network of nodes to perform in a
desired collective manner. This approach offers many advantages, such as:

1. Since the nodes in a neural network may operate locally on the incoming information, a high
degree of parallelism is achieved. That a network of interconnected neurons requires only an
adaptation rule is one of the many advantages of neural networks over existing
architectures.

2. Due to the distributed representation of information in neural networks, failure of a node or a
connection will not cause severe degradation of the network performance. Therefore, a highly
robust system can be obtained.

3.  Since  the  computations  are  done  in  parallel,  real-time  hardware  operational  performance 
can  be  achieved. 
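
The node model described at the start of this section (a weighted integration of the inputs followed by a decision) can be illustrated with a minimal sketch. The report does not give the node equations, so the hard threshold, the function name `node_output`, and the omission of the temporal part of the integration are all assumptions made only for illustration.

```python
from typing import Sequence


def node_output(inputs: Sequence[float],
                weights: Sequence[float],
                threshold: float = 0.0) -> int:
    """Hypothetical node: weighted spatial sum followed by a hard decision.

    Temporal integration is omitted here; the report mentions it but does
    not specify its form.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted integration of the inputs.
    activation = sum(w * x for w, x in zip(weights, inputs))
    # Decision based on the integration (assumed: simple threshold).
    return 1 if activation >= threshold else 0


if __name__ == "__main__":
    print(node_output([1, 0, 1], [0.5, 0.5, 0.5], threshold=1.0))
    print(node_output([1, 0, 0], [0.5, 0.5, 0.5], threshold=1.0))
```

Because each call uses only the node's own inputs and weights, many such nodes could be evaluated independently, which is the source of the parallelism noted in the list above.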

At present, most multi-layered neural network models assume full inter-layer
connectivity. This requirement severely limits the maximum size of VLSI or system
implementations of such architectures. Moreover, issues such as the limited resolution of
weights and local computations will make many neural models unattractive for hardware
implementation. Finally, the scaling properties of larger networks need to be taken into
consideration.

In  an  effort  to  overcome  the  above  difficulties,  our  research  efforts  have  been  focused  on 
neural  networks  that  can  be  implemented  in  VLSI  hardware.  This  is  especially  important  for 
real-time  operational  performance. 

II.  Research Objectives

The  research  objectives  were: 

1. Development of on-chip local training rules specifically designed for optimal operation in
hardware.

2. Use of information-theoretic analysis to determine the capabilities and limitations of
multi-layered limited-interconnect neural networks, and

3.  Analysis  of  hardware  implementation  issues  such  as  finite  precision  of  weight  storage  and 
multiplications,  compact  cell  development,  scaling  of  architectures  to  large  numbers  of 
neurons,  and  fault  tolerance. 
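
As an illustration of the finite-precision issue named in objective 3, the sketch below rounds a weight to a fixed number of uniformly spaced levels. The uniform quantizer, the function name `quantize_weight`, and the symmetric range `[-w_max, w_max]` are assumptions for illustration; the report does not state which weight storage scheme was analyzed.

```python
def quantize_weight(w: float, bits: int, w_max: float = 1.0) -> float:
    """Round w to the nearest of 2**bits - 1 uniform levels in [-w_max, w_max].

    An odd number of levels is used so that zero is exactly representable
    (an assumption of this sketch, not something stated in the report).
    """
    if bits < 2:
        raise ValueError("bits must be at least 2")
    levels = 2 ** bits - 1
    step = 2.0 * w_max / (levels - 1)
    # Saturate weights that fall outside the storable range.
    clipped = max(-w_max, min(w_max, w))
    return round(clipped / step) * step


if __name__ == "__main__":
    for bits in (2, 3, 4, 8):
        q = quantize_weight(0.4, bits)
        print(bits, q, abs(q - 0.4))
```

Within the storable range the rounding error is at most half a step, so each added bit of weight storage roughly halves the worst-case error.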

A.  Information-Theoretic  Analysis  of  Limited-Interconnect,  Finite  Parameter 
Neural  Networks 

Information theory provides a formalism for the analysis of network architecture and
adaptation. It can be demonstrated that a unifying organizing principle of network internal
representations is maximum entropy. For example, processing for improved linear
separability, increased data orthogonality, and projection of data onto a higher-dimensional
space are all specific instances of increased-entropy data representations.

Increased entropy can be shown to be an artifact of parameter constraint, e.g., round-off will
increase entropy. An optimal hidden unit representation (achieved by increasing the entropy of
the input) can be accomplished independent of the network implementation
parameters, such as connectivity and resolution.

Traditionally, neural networks are used as "black boxes" where the optimal network hidden unit
size is determined experimentally. By employing entropy as a qualitative measure of hidden
layer performance, an optimal network architecture may be obtained by "stacking" elements
until entropy is maximum for the given training set. This may permit substantial resource
savings: by employing limited fan-in nodes, an optimal hidden unit representation (the
layer of maximum entropy) may be discovered below the point of full connectivity
(where each node in a given layer receives a signal from all input bits).
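
The stacking procedure described above can be sketched as follows, under stated assumptions. The empirical entropy of the hidden-layer codes over the training set is a standard quantity; the particular limited fan-in node (`window_node`, which fires when at least half of a sliding window of input bits are set) and the greedy stopping rule are hypothetical choices made for illustration, not the report's method.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def code_entropy(codes: Sequence[Tuple[int, ...]]) -> float:
    """Empirical entropy (in bits) of the hidden codes over the training set."""
    n = len(codes)
    counts = Counter(codes)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def window_node(pattern: Sequence[int], start: int, fan_in: int) -> int:
    """Hypothetical limited fan-in node over a wrapping window of input bits."""
    bits = [pattern[(start + j) % len(pattern)] for j in range(fan_in)]
    return 1 if 2 * sum(bits) >= fan_in else 0


def stack_until_max_entropy(patterns: List[Sequence[int]],
                            fan_in: int) -> Tuple[int, float]:
    """Add nodes one at a time; stop when entropy no longer increases."""
    best, n_nodes = 0.0, 0
    for last in range(len(patterns[0])):
        codes = [tuple(window_node(p, s, fan_in) for s in range(last + 1))
                 for p in patterns]
        h = code_entropy(codes)
        if h <= best + 1e-12:  # the new node added no information
            break
        best, n_nodes = h, last + 1
    return n_nodes, best


if __name__ == "__main__":
    training_set = [[(i >> b) & 1 for b in range(4)] for i in range(16)]
    print(stack_until_max_entropy(training_set, fan_in=2))
```

Adding a node can only refine the partition of the training set, so the entropy never decreases. The greedy rule stops at the first node that adds nothing, which is a simplification, since a later node might still add information.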

B.  Reasons for Local Unsupervised Adaptation Rules

Local connectivity on the input field is commonly found in biological receptive fields,
allowing a single trend, mode, frequency, feature, etc. to represent a subset of the input.
Supervised adaptation requires global routing of error signals, which consumes large areas of the
semiconductor chip.

C.  Advantages  of  Local  Unsupervised  Adaptation  Rules 

A local processor's operation can be optimized with unsupervised adaptation. Theoretical
analysis and design are facilitated when low fan-in modular processing elements can be used.
Neuromorphic computation emerges when local processors are connected.


III.  Accomplishments

• We have derived a new local unsupervised training rule based on the optimization of the
mutual information among the inputs to a computing node.

•  A  network  of  neurons  with  weights  trained  using  this  local  training  rule  allows  the 
integration  of  local  information  to  arrive  at  the  global  representation  of  the  information. 

•  We  have  also  shown  that  for  various  arrangements  of  the  weights,  optimal  response  to 
certain  classes  of  patterns  can  be  obtained. 

• The neural network developed with this on-chip training rule requires only local
connections between neurons, is highly parallel in nature, and allows continuous and
real-time adaptive VLSI implementation.

• The advantage of this model over other existing models is that a multitude of behaviors, such
as edge detection, image compression, and dynamic object detection, can all be achieved with
the same architecture in continuous time. Other advantages are the compact size and
robustness to random noise.

• We have also developed a real-time VLSI smart sensor. The device consists of a
low-resolution optical sensor array and a high-resolution optical feature detector.

• A smart-scan mechanism is employed, in which the low-resolution sensor array examines
regions of the object's image for the presence of corners. Once it detects the presence of a
corner, a high-resolution feature detector, which is analogous to the high-resolution fovea of the
retina, scans the region.

• The high-resolution detector develops an encoding of the image by sensing and storing all
corners within the object and their respective orientations. The orientations and angles of the
corners in an object may be employed to generate a nearly unique abstraction of the object. This
scheme is particularly useful as corners are spatially localized and hence their detection
requires only local information.


•  The  classification  of  the  analog  vector  generated  from  the  corner  detecting  process  is 
accomplished  using  the  k-means  clustering  technique.  A  database  of  clusters  is  maintained  and 
updated  as  required. 

• The information-theoretic formalism that was developed allows us to analyze neuromorphic
processing occurring in different neural architectures.

•  The  rate  of  information  refinement  as  a  function  of  local  fan-in,  resolution  of  the 
components,  and  number  of  neural  processing  layers  has  been  developed. 
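
The k-means classification step mentioned in the list above can be illustrated with a minimal sketch of the standard algorithm. The two-component feature vectors, the function names `nearest` and `k_means`, and the initialization from the first k points are assumptions for illustration; the report does not describe the actual feature encoding or how the cluster database is updated.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest(point: Vector, centroids: Sequence[Vector]) -> int:
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def k_means(points: Sequence[Vector], k: int,
            iterations: int = 20) -> List[List[float]]:
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest centroid.
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


if __name__ == "__main__":
    # Hypothetical corner features, e.g. (orientation, angle) pairs.
    features = [(0, 0), (0, 1), (10, 10), (10, 11)]
    database = k_means(features, k=2)
    print(database, nearest((9, 9), database))
```

A new feature vector is classified by calling `nearest` against the stored centroids; rerunning `k_means` on the accumulated vectors is one simple way a cluster database could be refreshed.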